A method to detect influencers in social networks based on the combination of amplification factors and content creation

A social network is an efficient tool for information propagation, and content is the bridge between a product and its customers. Evaluating users' content creation is therefore valuable for improving information spreading on the social network. This paper proposes a method for extracting brand value with influencers by combining a user's amplification and content creation in influencer marketing. The amplification factors are studied based on the propagation of posts on the social network within a time window, which makes them more valuable when influencer marketing is run at a determined time. Moreover, a content creation score is studied to measure content creation based on the user's passion point for a brand and the quality of the content. The amplification factors and the content creation score are combined to analyze the interest of posts and to detect emerging influential users for a product in an influencer marketing campaign. Using the amplification factors, the passion points, and the content creation score, a system to manage influencer marketing on Facebook has been constructed and tested in a real-world campaign. The experimental results show that the influencers selected by the proposed method improve the conversion rate and revenue of the influencer marketing campaign.


Introduction
In the era of Industry 4.0, a social network is a convenient tool for conveying information [1][2][3] and helps a brand approach its targeted customers. Customers can find almost all essential information on social networks and pay attention to the brand's information [4,5]. From another viewpoint, a good content strategy can influence customer interactions and behaviors [20,21]. However, for a brand's benefit, content creation ability alone is inadequate. The content should convey the influencers' opinions, and it is better still if such content shows the influencers' passion for the brand or its products and services. Positive content primes massive social exposure. To express love for a brand, a user regularly publishes positive posts about it, and these posts attract interactions from the user's audience.
An exciting post will attract an audience if its content clearly expresses the seeder's sentiment about a specific topic. To summarize, the ability to produce positive content relevant to the business, continuously and consistently, is the major criterion for micro-influencer identification. Micro-influencers who satisfy these two requirements are called brand advocates. Thus, we need a solution that identifies brand advocates on social networks through an influencer identification method. A real-world marketing campaign was run to test the proposed method for detecting influencers, and it produced positive experimental results: the brand's sales increased, and the budget of the marketing strategy was used more effectively. This study proposes a novel approach for identifying influencers on social networks, using the amplification factors to evaluate information propagation and the content creation score to estimate a user's ability to create content on a social network. Firstly, a graph-based structure of a social network is introduced. This structure stores information about users and their relations in order to compute amplification factors on the network. Secondly, the content creation score is studied to measure the user's content creation based on the passion and quality of posts. The passion point is a measure of the user's fondness for a brand; it is determined from the sentiment score of the user's posts and his/her activity on the social network. The quality of posts is evaluated through an analysis of their content. Those measures are integrated to estimate the interest of posts, and they are summarized in Fig 2. The proposed method has been compared with some recent relevant methods as baselines. It has also been tested in the real world, and the experiment shows that the influencers selected by the proposed method deliver an efficient react-to-purchase conversion rate and a good return on investment in the influencer marketing campaign.
The next section of this article presents related research on detecting emerging influencers, sentiment analysis, estimating a user's love for a brand, and measuring content creation ability. Section 3 proposes some metrics to evaluate a user's ability to propagate information. Section 4 establishes the measures to compute the content creation score of users. Section 5 presents the method for combining the amplification factors and the content creation score to detect emerging influencers for a specific brand. The proposed method has been tested in practice, and Section 6 shows those results. The conclusion section summarizes the main results and outlines future work.

Related work
A social network is a suitable place for viral information. Although there are still fake news and negative impacts on social networks [22][23][24], social media is ideal for spreading positive information. It is also a popular tool for communicating and establishing relationships between products/brands and their targeted customers [25]. In digital marketing, influencer marketing uses influencers to spread the information of a specific brand virally on the social network. Thus, enhancing opinion leaders' affection is crucial to maximizing influence in business [26]. If those influencers have high-quality posts, their posts will be more attractive and will impact targeted audiences more effectively. Hence, identifying influencers by combining amplification factors with an evaluation of content creation yields the essential influencers for an influencer-based marketing campaign.
The identification of prominent users in social networks is a critical step in speeding up the spread of information, as in marketing applications, or in preventing the spread of harmful content [27]. The impact of users on a social network has been measured by many methods [22,28], such as association rules [29], the nomological network [30], and diffusion models [31]. Those methods can be classified as Local Measures [27,32], Short Path-Based Measures [33], Iterative Calculation-Based Measures [34,35], Coreness-Based Measures [36], and Machine-Learning Algorithms [37,38].
The authors in [29] proposed an association learning method to determine relationships between users. Those results were used to verify the identification of the most influential users. In [15,30], relations between value creation practices, brand community markers, and brand loyalty were built using the nomological network. This model is useful for exploring a brand's loyal communities.
Besides, influence optimization is studied in [31] based on the properties of diffusion models. The goal of this problem is to select key opinion leaders who can reach a large part of a network. Nonetheless, those properties are too general to apply to specific problems. In [39], a closeness measure was defined to quantify users' closeness based on their interactions. Incorporating this measure into the ranking mechanism yields an influence ranking algorithm based on PageRank, called EIRank. A dataset collected from Twitter is used to evaluate this algorithm.
Another method to recognize opinion leaders on social networks, called Milestone Rank, has been studied in [40]. It combines a selectivity measure and an interest measure, which are the selection and engrossment of a user for a topic, respectively, from a set of milestones. However, Milestone Rank does not use amplification factors over a time window.
The SNet model, which describes the two main objects on the social network, users and posts, was proposed in [31,32]. The SNet model structure represents users' information and actions and the relations between users and posts on a social network. In this paper, using the SNet model, the method for extracting brand value with influencers is proposed by combining the user's amplification and content creation in influencer marketing. The amplification factors are studied based on the propagation of the posts on the social network within a time window. Those factors are reasonable to use when running influencer marketing at a crucial time.
To measure the interest of a post, the attraction of its content needs to be evaluated. An exciting post will absorb many users and spread very fast on the network. It has content to determine a certain topic, and it shows the seeder's attitude distinctly. The current methods, which evaluate a post's content, do not have features analyzing how to write an interesting post or the user's passion for a brand. Thus, they cannot estimate the content creation precisely to detect influencers.
The study in [27] proposed a general framework and a methodology to predict influential users who affect the behavior of other users in a time period. This method is built on historical interactions that occurred within online social network groups.
Sentiment analysis is the analysis of sentiments, emotions, and opinions in data [32]. It aims to evaluate the impact of news and social media [41]. The machine learning approach is an effective method for sentiment analysis [33,42]. It can also be combined with language-oriented techniques to analyze sentiment, such as self-attention neural networks and their improvements [43][44][45]. In [46], the relations between sentiments and the Brazilian stock market movement were constructed based on Portuguese sentiment analysis with a Multilayer Perceptron. Besides, some methods that integrate lexicons into deep learning-based sentiment analysis models were studied, combining a two-channel CNN-LSTM and a branching combination of CNN and LSTM/BiLSTM branches [47].
The results in [48] used a fuzzy system to design a measure of influence for an individual node in the focal network and the associated networks. The authors in [49] analyze the positive maximization influence of nodes to select the seed set with the most positive influence on the social network. However, those methods are theoretical and difficult to apply in the real-world social network. A social network includes a set of relations between objects on the network, such as users and posts. Ontology is a useful tool for representing the relationships between objects [50,51] and building a searching system for complex information [52,53]. Hence, with its benefits, ontology can be studied to increase the ability to detect influent nodes on the social network.
Passion point is a measure to compute the brand-loving of a user. In [54], this point is computed using some values on the users' posts related to the specific brand. Those values are the total posts about that brand and the average reactions with each post. However, the action of the user on the social network is not mentioned in that research.
In [55], group decision-making is used to analyze discussions on a social network. In an ordinary social network discussion, a set of people disputing a certain problem can be detected by using sentiment analysis techniques. The study in [56] proposed a method to profile influential users on social media platforms. They are divided into three kinds: opinion leader, opinion reverser and topic initiator. Their profiling can reveal the differences between their opinions and their dynamic evolution. The findings can help managers focus on the attention and emotion of influencers. In the context of groups created in social networks, the research in [57] proposed a general framework and a methodology to predict influential users who impact the behavior of other users in a time period. This method is constructed based on historical interactions that occurred within the group. Nevertheless, those methods are only used to extract a set of users; they are not sufficient to retrieve the information needed for influencer detection.

The proposed measures of information propagation on a social network
In this section, we describe the proposed measures of information propagation on a social network.

Model of social network
The social network includes objects (users and posts) and the relations between them [58]. Thus, the structure of this network is represented by a graph-based relational model. However, this model requires a conceptual structure that represents its information completely.
Definition 3.1 [54]. The structure of a social network is a relational model, which is a tuple $(U, P, R)$, in which $U$ is a set of users, $P$ is a set of posts, and $R$ is a set of relations between users and posts on this social network. This model is called the SNet model. The structure of each component is as follows:
(1) U-set: Each $u \in U$ is a user; its structure has four elements:
(3) R-set: Each relation in $R$ is one of two kinds:

PLOS ONE
where:
$R_U$: a set of relations between two users. It includes:
+ $friend \subseteq U \times U$: a user is a friend of another user.
+ $follower \subseteq U \times U$: a user is a follower of another user.
$R_P$: a set of relations between a user and a post. It includes:
+ $comment \subseteq U \times P$: a user comments on a post.
+ $share \subseteq U \times P$: a user shares a post.
+ $reaction \subseteq U \times P \times \mathbb{N}$: a user reacts to a post. Each kind of reaction is a natural number.
Definition 3.2. Given a post $p \in P$, the structures of p.Sh, p.Com and p.Reaction are organized as follows:
p.Sh: $\{(v, Time) \mid v \in U$, Time is the timestamp at which the user $v$ shares the post $p\}$
p.Com: $\{(v, Time) \mid v \in U$, Time is the timestamp at which the user $v$ comments on the post $p\}$
p.Reaction: $\{(v, s, Time) \mid v \in U$, Time is the timestamp at which the user $v$ reacts to the post $p$, $s$ is the kind of this reaction$\}$
In which, Time has the timestamp data type.
a/ Some metrics of the user $u$ are shown in Table 1.
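The SNet model of Definitions 3.1 and 3.2 can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: users are represented by plain string identifiers (the four-element user structure is not reproduced here), the class and method names (`Post`, `SNet`, `add_post`, `share`, `comment`, `react`) are hypothetical, and a pair `(u, v)` in `follower` is assumed to mean that `u` follows `v`.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    post_id: str
    seeder: str                                     # user who created the post (p.Seeder)
    content: str = ""
    shares: list = field(default_factory=list)      # p.Sh: (user, time)
    comments: list = field(default_factory=list)    # p.Com: (user, time)
    reactions: list = field(default_factory=list)   # p.Reaction: (user, kind, time)


@dataclass
class SNet:
    users: set = field(default_factory=set)         # U
    posts: dict = field(default_factory=dict)       # P, keyed by post id
    friend: set = field(default_factory=set)        # R_U: pairs (u, v)
    follower: set = field(default_factory=set)      # R_U: (u, v), assumed "u follows v"

    def add_post(self, post: Post) -> None:
        self.users.add(post.seeder)
        self.posts[post.post_id] = post

    def share(self, user: str, post_id: str, time: int) -> None:
        self.posts[post_id].shares.append((user, time))

    def comment(self, user: str, post_id: str, time: int) -> None:
        self.posts[post_id].comments.append((user, time))

    def react(self, user: str, post_id: str, kind: int, time: int) -> None:
        # Each kind of reaction is encoded as a natural number.
        self.posts[post_id].reactions.append((user, kind, time))


net = SNet(users={"u", "v"})
net.follower.add(("v", "u"))
net.add_post(Post("p1", seeder="u", content="hello"))
net.share("v", "p1", time=10)
net.react("v", "p1", kind=1, time=12)
```

Storing the interaction lists on the post keeps the timestamps next to the post they belong to, which is convenient when interactions have to be filtered by a time window.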

Amplification factors of a user
where $SU_1(u)$ ($SU_2(u)$ and $SU_3(u)$) is the set of users who share $u$'s posts and who are friends (followers and unrelated users, resp.) of the user $u$; $CU_1(u)$ ($CU_2(u)$ and $CU_3(u)$) is the set of users who comment on $u$'s posts and who are friends (followers and unrelated users, resp.) of the user $u$; ListFollowers, and $\lambda$: constant.
b/ The influential vector $IU(u)$ measures the influence of the user $u$. The formula of $IU(u)$ as a vector is similar to [36]. However, the determination of each element, $Imp(u)$ and $Popularity(u)$, is improved.
Some conditions:
• An unrelated user is only concerned about a post if this post is inspiring and attractive on the social network, so the weight for unrelated users' reactions is higher than the weight for others' reactions. A friend is usually more excited than a follower, so the weight for friends' reactions is lower than the weight for followers' reactions [17,18]. Thus, we have the conditions: $\alpha_1 \leq \alpha_2 \leq \alpha_3$ and $\beta_1 \leq \beta_2 \leq \beta_3$.
• When a post is shared, the user thinks this post is useful to others; when a post is exciting, the user comments on it; pressing "like" may be a habit [17,18]. Thus, we have the condition: $0 < \gamma \leq \beta \leq \alpha < 1$.
where $\pi_{user}$ is the timestamp when the user reacts to, shares, or comments on the post $p$.
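To illustrate how the weight ordering and the time window interact, the sketch below counts the interactions on a post whose timestamps fall inside a window of length `delta` after posting and weights shares, comments and reactions by `alpha`, `beta` and `gamma`. It is not the paper's formula: the function name `weighted_interactions`, the linear weighted sum, the window boundaries and the default weights are all assumptions chosen only to satisfy $0 < \gamma \leq \beta \leq \alpha < 1$.

```python
def weighted_interactions(post_time, shares, comments, reactions, delta,
                          alpha=0.5, beta=0.3, gamma=0.2):
    """Weighted count of interactions with timestamp in [post_time, post_time + delta].

    shares, comments: lists of (user, time); reactions: list of (user, kind, time).
    The weighted sum is an illustrative assumption, not the paper's formula.
    """
    if not 0 < gamma <= beta <= alpha < 1:
        raise ValueError("weights must satisfy 0 < gamma <= beta <= alpha < 1")

    def in_window(time):
        return post_time <= time <= post_time + delta

    n_shares = sum(1 for _, time in shares if in_window(time))
    n_comments = sum(1 for _, time in comments if in_window(time))
    n_reactions = sum(1 for _, _, time in reactions if in_window(time))
    return alpha * n_shares + beta * n_comments + gamma * n_reactions


score = weighted_interactions(
    post_time=0,
    shares=[("a", 5), ("b", 50)],          # the second share is outside the window
    comments=[("c", 3)],
    reactions=[("d", 1, 2), ("e", 1, 100)],
    delta=10,
)
```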

Content creation score
The content of a post is very significant for engaging audiences with its information. In this section, a measure for estimating the quality of content creation is proposed. This measure is established by combining the sentiment score and the passion point [59,60]. The method in this section improves on the results in [59].
Sentiment score. Sentiment analysis is the classification of human emotions by using text analysis techniques. The sentiment score measures a person's feelings about a specific brand by analyzing the words which were used to debate or discuss it. In this section, the sentiment of posts on a social network is analyzed with a sentiment lexicon. The attributes of positivity and negativity are utilized to evaluate the sentiment score of a post.
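The sentiment score in this section relies on pointwise mutual information (PMI) between a word and positive or negative content. A common construction, assumed here for illustration only, scores a word as PMI(word, positive) minus PMI(word, negative). In the sketch below the function names, the document-level co-occurrence counts, the small smoothing constant `eps` and the averaging over the words of a post are all hypothetical choices, not the paper's exact formulas.

```python
import math


def pmi(n_joint, n_word, n_class, n_total, eps=1e-9):
    """PMI(word, class) = log2(P(word, class) / (P(word) * P(class))), smoothed by eps."""
    p_joint = n_joint / n_total
    p_word = n_word / n_total
    p_class = n_class / n_total
    return math.log2((p_joint + eps) / (p_word * p_class + eps))


def word_sentiment(word, pos_docs, neg_docs):
    """Assumed score: PMI(word, positive) - PMI(word, negative).

    pos_docs / neg_docs are lists of word sets labelled positive / negative.
    """
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    n_total = n_pos + n_neg
    in_pos = sum(word in doc for doc in pos_docs)
    in_neg = sum(word in doc for doc in neg_docs)
    n_word = in_pos + in_neg
    if n_word == 0:
        return 0.0  # unseen word: treated as neutral
    return (pmi(in_pos, n_word, n_pos, n_total)
            - pmi(in_neg, n_word, n_neg, n_total))


def post_sentiment(words, pos_docs, neg_docs):
    """Assumed aggregation: mean word score over the words of the post."""
    scores = [word_sentiment(w, pos_docs, neg_docs) for w in words]
    return sum(scores) / len(scores) if scores else 0.0


pos_docs = [{"great", "phone"}, {"great", "camera"}]
neg_docs = [{"bad", "phone"}, {"bad", "battery"}]
```

With this toy corpus, a word that appears only in positive documents gets a positive score, a word that appears equally often in both classes gets a score of zero, and a word seen only in negative documents gets a negative score.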

Definition 4.1 [59]:
The sentiment score of a word $\omega$, denoted $SS(\omega)$, is determined as: where posi (and nega) is the positive (and negative) content. The function PI, which indicates the pointwise mutual information, is computed by the following formulas: The sentiment score of a post $p$, denoted $SS(p)$, is computed by the following formula:
The formula to compute passion point. The measure of a user's love for a brand is called the passion point. In [54], this point is computed by the Wilson score interval method for the binomial proportion confidence interval [61].
Definition 4.4 [54]: Let $u \in U$ be a user and $X$ a brand. a) The ranking score of the user $u$ with brand $X$:
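For reference, the lower bound of the Wilson score interval for a binomial proportion can be computed as in the following sketch. The formula is the standard one; how it enters the passion point (which counts play the role of `positive` and `total`) is not specified here, so those arguments and the function name `wilson_lower_bound` are hypothetical placeholders.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion.

    positive: number of "successes" (for example, positive posts; an assumption),
    total: number of trials, z: normal quantile (1.96 for about 95% confidence).
    """
    if total == 0:
        return 0.0
    phat = positive / total
    denominator = 1 + z * z / total
    centre = phat + z * z / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * total)) / total)
    return (centre - margin) / denominator


few = wilson_lower_bound(9, 10)      # 90% positive out of 10 observations
many = wilson_lower_bound(90, 100)   # 90% positive out of 100 observations
```

The appeal of this bound for ranking is that the same observed proportion earns a higher score when it is backed by more observations, so a user with 90 positive posts out of 100 ranks above a user with 9 out of 10.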

b) The formula that computes the passion point of the user $u$ with brand $X$ [54]: However, the activeness of the user is not considered in Formula (9). In practice, the more a user is interested in the brand, the more activities related to it he/she has. For example, if a certain person loves the brand, he/she will frequently dedicate time and contribute to this brand on social media platforms. Hence, the more active a user is with a brand, the more passion, dedication, and contribution he/she brings to increasing the brand value on the social network. Formula (9) is improved by incorporating the feature of activities.
Definition 4.5: (Passion point) Let $u \in U$ be a user and $X$ a brand.
a) The activeness of the user $u$ with the brand $X$ is computed by: where $n_{day}$ = the number of report days. b) The passion point, denoted $PP_X(u)$, is computed by:
The quality of posts. Given a social network $F = (U, P, R)$ as SNet model, a user $u \in U$, and a post $p \in P$ on the social network $F$. Denote:
• $word(p)$: the quantity of words in the post $p$.
• $word_{pos}(p)$: the quantity of positive words in the post $p$.
The method for estimating the content quality of the user's posts is proposed in this section. In common practice, posts which are too short cannot give full information, especially information about products. They are not useful for influencers who want to attract their audience by introducing a product. In this study, posts with few words are considered meaningless in advertising; posts must have an appropriate length. Hence, only meaningful posts are considered when evaluating the content quality of the user's posts. A meaningless post is a post whose number of words is smaller than the average number of words per post. After excluding meaningless posts, the content quality of posts is determined based on the remaining posts.
The content quality of $u$'s posts, denoted $Q(u)$, is estimated as follows:
Step 1: Sort the posts in u.ListPosts in ascending order of their number of words.
• Select the $k$ posts in u.ListPosts which have the least number of words.
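The exclusion of meaningless posts described above can be sketched as follows. This is a minimal illustration under assumptions: posts are plain strings, words are separated by whitespace, the average is taken over the given user's own posts, and the function name `meaningful_posts` is hypothetical.

```python
def meaningful_posts(posts):
    """Keep the posts whose word count is at least the average word count.

    posts: list of post texts of one user. Words are split on whitespace,
    which is a simplifying assumption for illustration.
    """
    if not posts:
        return []
    counts = [len(text.split()) for text in posts]
    average = sum(counts) / len(counts)
    return [text for text, count in zip(posts, counts) if count >= average]


posts = [
    "ok",
    "nice phone",
    "this phone has a really great camera and battery",
]
kept = meaningful_posts(posts)  # word counts are 1, 2 and 9; the average is 4
```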

• Determine: Step 3: The quality of posts for user $u$ is estimated by:
Content creation score. When a user loves a brand, he/she will create some high-quality, attractive posts on a social media platform to acquaint his/her audience with that brand [4,17]. The content creation score estimates a user's ability to attract an audience through his/her posts. For a user $u$, this score is computed by combining the passion point for a brand $X$, $PP_X(u)$, and the quality of the posts' content, $Q(u)$.
Definition 4.6: (Content creation score) Let $u \in U$ be a user and $X$ a brand. The content creation score of the user $u$ for the brand $X$, denoted $CC_X(u)$, is computed as follows: In which, $PP_X(u)$ and $Q(u)$ are determined by (11) and (13), resp.
Eq (14) determines the content creation score by combining the passion and the content quality of posts. The value of $PP_X(u)$ increases when the user is passionate about the brand $X$, so the user will create some high-quality posts to introduce that brand. In practice, there are users who regularly post positive content about a brand, but whose content is nevertheless always the same. Besides, the passion point is a cumulative score of a user for the brand, accumulated gradually over the time of interaction and information sharing; therefore, the creativity of these users can be underestimated [21]. The value of $Q(u)$ in (14) reflects the quality of posts through positive words. If users have low content creation, then although they use many positive words, those words are repeated many times. Thus, the role of $\log(Q(u))$ in (14) is to discount those repeated positive words in posts.

The combination method for detecting influencers on a social network based on content creation
Homophily and social reinforcement are two characteristics of community structure on a social network. Homophily states that comparable individuals engage and share content more frequently than other users [62]. Indeed, users are more likely to bond with those who share similar interests, and various studies have demonstrated that homophily among users has an impact on the predictability of user profiles [63] and that it may be effectively used for link prediction and product suggestion [64]. Social reinforcement is the behavior of one person, which can affect other people who have relations with him/her, such as his/her audiences or friends/ followers of audiences. This section proposes a method for detecting emerging influencers of a given product or brand based on the combination of information propagation and content creation score.

Create the homophily of a determined brand
Homophily means that similar individuals associate with each other more often than with others on social networks [65]. Instant advertising and massively targeted advertising both employ the homophily notion to understand how a user's friends influence the predictability of his or her behavior, or to promote products. Homophily can be observed in online social networks, but it is difficult to investigate the principle of homophily. The results in [66] show that a simple product of degree and homophily measures can be quite effective in guiding local search. This section presents a method to construct a sub-graph showing a group of users who are fond of the determined brand as a homophily. This analysis uses the passion point and the content creation score to evaluate users in the social network.
Algorithm for creating the homophily of a determined brand. Definition 5.1 [32]. Let $F = (U, P, R)$ be a social network as SNet model.
The weighted graph $G = (V, E)$ contains the links between users on the network $F$, in which $V$ is a set of vertexes representing the users in $U$, and $E$ is a set of weighted edges representing the relations between users. The weight of each edge $e \in E$, denoted $w(e)$, is computed as follows: If $follower(u_i, u_j)$, then $w(e_{ij}) = 1$.
For $p \in P$ and $u_k = p.Seeder$: For each $u_i \in U$ and $u_i \neq u_k$ do: • If $reaction(u_i, p, s)$, then $w(e_{ik})$ += 1.
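The two weighting rules stated in Definition 5.1 can be sketched as follows. Only the follower rule and the reaction rule given above are implemented; any further rules are left out. The function name `build_weighted_graph`, the dictionary-based post records and the use of ordered pairs `(u_i, u_k)` as edge keys are assumptions made for illustration.

```python
from collections import defaultdict


def build_weighted_graph(followers, posts):
    """Return edge weights w[(u_i, u_j)] from follower and reaction relations.

    followers: iterable of (u_i, u_j) pairs with follower(u_i, u_j).
    posts: list of dicts with keys "seeder" and "reactions" [(user, kind), ...].
    """
    weights = defaultdict(int)
    # Rule 1: follower(u_i, u_j) sets w(e_ij) = 1.
    for u_i, u_j in followers:
        weights[(u_i, u_j)] = 1
    # Rule 2: each reaction of u_i on a post seeded by u_k adds 1 to w(e_ik).
    for post in posts:
        u_k = post["seeder"]
        for u_i, _kind in post["reactions"]:
            if u_i != u_k:
                weights[(u_i, u_k)] += 1
    return dict(weights)


graph = build_weighted_graph(
    followers=[("a", "b")],
    posts=[{"seeder": "b", "reactions": [("a", 1), ("c", 2), ("b", 1)]}],
)
```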
In this section, a method for building a sub-graph of the graph representing the social network is proposed based on a given brand or product. This method extracts a sub-graph showing a group of users who are fond of the brand. That sub-graph can detect the homophily for the given brand.
Input: The specific brand $X$. The graph $G$ representing the relations between users on the social network $F = (U, P, R)$. A constant $\omega > 0$, the minimum value of the passion point for the brand $X$.
Output: A sub-graph of users loving brand $X$.
The following algorithm constructs the sub-graph:
Step 1: For each user $u \in V$ of the graph $G$:
  Check u.ListPosts. If the user $u$ mentioned brand $X$ in his/her posts:
    If $PP_X(u) \geq \omega$, where $PP_X(u)$ is determined by Eq (11):
      Add the node $u$ into the sub-graph; go to Step 2.
Step 2: Extend to the neighbors of the current node.
  Add an edge between the current node $u$ and its neighbor $v$ into the sub-graph if:
  Case 1: The neighbor $v$ also mentioned the brand $X$.
  • Create an edge between the user $u$ and the neighbor $v$ with its weight determined as in Definition 5.1.
  • If the post $p$ of the user $v$ is related to the brand $X$ and that post is shared from a user $y = p.Seeder$ ($y \neq v$), make an edge between this neighbor $v$ and the user $y$.
  Case 2: The neighbor $v$ reacts to or comments on $u$'s posts related to $X$.
Step 3: If there are still nodes in the network that have not yet been traversed, go to Step 1.
The complexity of Algorithm 1. Users on a social network have certain numbers of friends, followers and posts on that network. In this section, the complexity of Algorithm 1 is estimated based on those parameters, under the assumption that all users have about the same number of posts, friends and followers.
Given a social network $F = (U, P, R)$ as SNet model, and a brand $X$. Denote:
• $n = card(U)$: the number of users on the network.
• $m$: the average number of posts for each user.
• $u_1$: the average number of friends for each user.
• $u_2$: the average number of followers for each user.
• $L_X$: the list of keywords related to the brand $X$.
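The steps above can be sketched in a simplified form. This is an illustration under assumptions rather than a full implementation: the passion points and the brand-mention and interaction information are taken as precomputed inputs, edge weights are omitted, the extra edge to the seeder of a shared post (second bullet of Case 1) is left out, and all names (`brand_subgraph`, `neighbors`, `mentions_brand`, `interacts_with`) are hypothetical.

```python
def brand_subgraph(neighbors, mentions_brand, passion, interacts_with, omega):
    """Simplified sketch of the sub-graph construction for a brand X.

    neighbors: dict user -> set of neighbouring users in G.
    mentions_brand: set of users with at least one post mentioning X.
    passion: dict user -> passion point PP_X(u), assumed precomputed.
    interacts_with: set of (v, u) pairs where v reacted to or commented on
        u's posts related to X.
    omega: minimum passion point.
    """
    nodes, edges = set(), set()
    for u in neighbors:
        # Step 1: keep users who mention X with a passion point of at least omega.
        if u not in mentions_brand or passion.get(u, 0.0) < omega:
            continue
        nodes.add(u)
        # Step 2: extend to the neighbours of the current node.
        for v in neighbors[u]:
            case_1 = v in mentions_brand            # neighbour also mentions X
            case_2 = (v, u) in interacts_with       # neighbour interacts with u's posts
            if case_1 or case_2:
                nodes.add(v)
                edges.add(frozenset((u, v)))
    # Step 3 (visiting the remaining nodes) is covered by the outer loop.
    return nodes, edges


nodes, edges = brand_subgraph(
    neighbors={"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}},
    mentions_brand={"a", "b"},
    passion={"a": 0.8, "b": 0.1},
    interacts_with={("c", "a")},
    omega=0.5,
)
```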
Lemma: Given a post $p \in P$ and a brand $X$. The complexity of determining whether the post $p$ is related to the brand $X$ is: $O(card(L_X) \cdot word(p))$ (15)
where $word(p)$ is the number of words in the post $p$.
Theorem 1: The complexity of Algorithm 1 is: $O(\omega \cdot card(L_X) \cdot n^2 \cdot m)$ (16)
where $\omega$ is the average number of words for each post.
Proof: There are two main steps in Algorithm 1: Step 1 and Step 2.
+ Step 1 of Algorithm 1: For each user $u \in U$, we need to do:
• Step 1.1: Determine whether the user $u$ mentioned the brand $X$ in his/her posts. Here, $m$ is the average number of posts for each user.

Since each user has $m$ posts on average, the total number of words in a user's posts is $\omega \cdot m$, where $\omega$ is the average number of words for each post. Thus, Formula (17) can be written as follows: $O(card(L_X) \cdot \omega \cdot m)$.
From the Lemma, and since the numbers of friends, followers and posts of user $u$ are $u_1$, $u_2$, and $m$ respectively, we have the complexity of Case 2. By (21) and (22), the bound of Theorem 1 follows.
In practice, within a determined business sector, the list $L_X$ is a set of featured keywords for the brand $X$, determined by marketing experts in that sector, so $card(L_X)$ can be treated as a constant. Hence, by (25), the complexity of Algorithm 1 is: $O(n^2 \cdot m \cdot \omega)$

The influencers based on the content creation propagation
Content creation propagation on the posts has been represented by user influence and the number of successful propagations based on computing the user's post's quality and the user's passion point.
Definition 5.2: Given a user $u \in U$, a post $p \in P$, the time window $\delta$, and the brand $X$. The user $u$ is the seeder of $p$, $u = p.Seeder$, and the post $p$ is related to brand $X$.
A set of users who propagate the content $p$ in the time window $\delta$ with the determined threshold of content creation scores is determined as follows: where $\theta$ is the threshold of the content creation score, and $I_p^u(\delta)$ and $CC_X(v)$ are computed by Eqs (2) and (14), resp.

a/ The user $u$ is more influential than the user $v$ in the time window $\delta$, denoted $v \preceq_X u$, if:
i: $IU(v) \leq IU(u)$ and $AICC_v^X(\delta) \leq AICC_u^X(\delta)$
ii: OR $(Popularity(v), AICC_v^X(\delta)) \leq (Popularity(u), AICC_u^X(\delta))$
b/ Let $G \subseteq U$ be a group of users. A user $w \in G$ is an influential user on $F$ in the time window $\delta$ for the brand $X$ if: $card(\{v \in G \mid v \preceq_X w\}) \geq \mu \times card(G)$, where $\mu$ is a constant, $0 < \mu < 1$.

Determining the influencers on a social network combining the content creation score
Algorithm for determining the influencers on a social network. For a given brand, the influencers on the social network who can convey the brand's information to target audiences are determined by using the passion point and the content creation score. The process for determining those influencers is given by the following algorithm.
Let $F = (U, P, R)$ be a social network as the SNet model and $X$ a brand. Algorithm 2 detects the potential influential users of the brand $X$, who can be selected to run an influencer marketing campaign on the social network $F$ in the time window $\delta$. Those influencers can also create excellent content to attract their audiences.
Step 1: Create a graph $G$ representing the relations between users on the social network $F$ as in Definition 5.1.
Step 2: Using Algorithm 1, construct a sub-graph of $G$ to determine the homophily of users who love the brand $X$. This group is denoted $G_X$.
• Stage 2: Detect the influencers combining the evaluation of their content creation.
Step 3: For each user $u \in G_X$, compute the influence metrics of the user $u$:
• The content creation score $CC_X(u)$ as in Formula (14).
• The average of interactions based on the content creation for $u$'s posts related to the brand $X$, $AICC_u^X(\delta)$, calculated by Formula (27).
Step 4: Detect the set of emerging influencers in $G_X$ as in Definition 5.5.
$S := \{\}$;
For each user $u \in G_X$ {
  $S_u(\delta) := \{v \in G_X \mid v \preceq_X u\}$, in which the relation "$\preceq_X$" was defined in Definition 5.5.
  If $card(S_u(\delta)) \geq \mu \times card(G_X)$ then $S := S \cup \{u\}$;
}
Return $S$, the set of emerging influencers in $G_X$.
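Step 4 can be sketched as follows, given precomputed metrics. Several points are assumptions made for illustration: the influential vectors are compared component-wise, a user is not counted in his/her own dominated set, the function names `is_dominated` and `emerging_influencers` are hypothetical, and the metric dictionaries stand in for $IU(u)$, $Popularity(u)$ and $AICC_u^X(\delta)$.

```python
def is_dominated(v, u, iu, popularity, aicc):
    """True if u is at least as influential as v (conditions i or ii above).

    Vector comparison is assumed to be component-wise.
    """
    cond_i = (all(a <= b for a, b in zip(iu[v], iu[u]))
              and aicc[v] <= aicc[u])
    cond_ii = popularity[v] <= popularity[u] and aicc[v] <= aicc[u]
    return cond_i or cond_ii


def emerging_influencers(group, iu, popularity, aicc, mu):
    """Return users who dominate at least mu * card(group) members of the group."""
    selected = set()
    for u in group:
        dominated = {v for v in group
                     if v != u and is_dominated(v, u, iu, popularity, aicc)}
        if len(dominated) >= mu * len(group):
            selected.add(u)
    return selected


group = ["a", "b", "c"]
iu = {"a": (3, 3), "b": (2, 2), "c": (1, 1)}
popularity = {"a": 3, "b": 2, "c": 1}
aicc = {"a": 30.0, "b": 20.0, "c": 10.0}
influencers = emerging_influencers(group, iu, popularity, aicc, mu=0.5)
```

In this toy group, user "a" dominates both other users (2 is at least 0.5 × 3), while "b" dominates only "c", so only "a" is selected. Comparing every pair of users makes this step quadratic in the size of the group.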

The complexity of Algorithm 2. Theorem 2:
The complexity of Algorithm 2 is: In which, $\pi_{user}$ is the timestamp when the user reacts to, shares, or comments on the post $p$.
For each post $p$, the complexity of (2) is $O(n^2)$. Because $card(IPCC_p^X(\delta)) \leq n$, the complexity of (27) is $O(n^3)$. From the Lemma, the complexity of determining whether the post $p$ is related to the brand $X$ is $O(card(L_X) \cdot word(p))$. Each user $u$ has $m$ posts. For each post of user $u$, we check the relation between that post and the brand $X$, and estimate $SP_p^X(\delta)$ through Formula (27). Thus, by the complexity of (27), the complexity of (28) for each user $u$ is as follows:
$O(m \cdot \max(card(L_X) \cdot word(p),\ n^3)) = O(\max(card(L_X) \cdot word(p) \cdot m,\ n^3 \cdot m)) \leq O(\max(card(L_X) \cdot \omega \cdot m,\ n^3 \cdot m))$
where $\omega$ is the average number of words for each post. From (33), the complexity of computing the average of interactions based on posts related to the brand $X$ in the time window $\delta$ is: From the formulas (31), (32) and (34), the complexity of Stage 2 of Algorithm 2 is:
$O(\max(n,\ n^2 \cdot m,\ n^4 \cdot m)) = O(n^4 \cdot m)$ (35)
+ Through the complexity of Stage 1 and Stage 2 as (25) and (35), the complexity of Algorithm 2 is as follows:

Testing and experimental results
Nowadays there already exist several companies that provide marketing management tools, which will be covered in more detail in the rest of this subsection, such as: Hiip [67], Viral-Works, [68]. However, due to business purposes, solution providers have never released details of their solutions or revealed detailed statistics. Hence, we aim to design a holistic solution to both publish to the community and empower brands through the entire process from selecting the appropriate influencers, using a more accurate marketing efficiency measurement tool to generating more sales. To demonstrate the effectiveness of this novel system, we compared the effectiveness of Influencer marketing campaigns in which Influencers are identified by our system with the results of actual Influencer marketing campaigns that the brands conducted before.
Our proposed method has been used to detect the influencers of a brand. From the list of brand's consumers, by computing their measure on the social network, the system uses the proposed measures to detect influencers to viral this brand. Those influencers will be the crucial factor in running an influencer marketing campaign for the brand. The work of the system is shown in Fig 3. The method begins by putting together a database of social media users and their posts. A crawling engine will acquire those users from social media. An initial data set will be entered to improve the relevance of crawling users to a specific brand. These initial data might be a list of influencers from prior campaigns, hashtags, groups, or other information that the crawler can use to create the database. Simultaneously, a scoring engine will track the two metrics indicated above, including the amplification factors and the content creation score. These data will be reviewed by business users for their influencer campaign, and they will be regularly monitored and optimized. In addition to these engines, a front-end system for influencers is being developed with the goal of allowing businesses to use gamification to inspire and nurture them. Gamification's use cases can simply be that the better and appealing posts/comments are, the more influencers can be rewarded. The system can also establish an affiliate connection to a company's e-commerce platform, allowing influencers to be judged not just on their amplification and content production, but also on the income generated by customers who bought products after seeing them on social media.
The primary function of this system is to determine how influential people are on social networks, and then to assist businesses in increasing brand recognition and conversion by leveraging these scores through gamification. As a result, this application can be used for a variety of corporate purposes, such as a brand ambassador campaign, a staff advocacy campaign, or a review-to-earn or share-to-earn strategy [69]. Influencer marketing appears to be most commonly used to increase brand awareness. However, from a commercial standpoint, the money generated by any marketing campaign is an important metric to track. This section demonstrates how the proposed method can be used for influencer commerce in addition to boosting awareness on social media. Influencer commerce is a new strategy that brands and marketers are employing to drive leads and sales. This strategy will alter how influencers generate money and provide additional options for businesses to make direct sales.

Comparing with SNOL and SP approaches
The SNOL (Social Network Opinion Leaders) score is proposed in [70] as an ensemble of user features weighted by adjustable parameters. These parameters are identified by a fuzzy-based algorithm that follows the work in [71]. The SNOL score in [70] was evaluated on a dataset collected from Twitter. Since our experimental data are collected from Facebook, the attributes have to be transformed to fit the specification of Facebook data. Firstly, the retweet action on Twitter is mapped to the share action on Facebook. Secondly, a tweet on Twitter is treated as a post on Facebook. The remaining features, such as focus rate, activeness, and authenticity, keep the same meaning.
To detect influencers on Instagram, the work in [72] takes a Social Network Analysis approach. In particular, it studies the spreading (SP) behavior on a knowledge graph structure. Using the Linear Threshold Model, the algorithm calculates the proportion of nodes reached and the number of days required to reach the limit of the graph. The SP score, which measures the influence of a user, is defined as the proportion of nodes reached divided by the number of days required.
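The SP computation described above can be illustrated with a minimal sketch. It is not the implementation of [72]: the function names, the toy graph, the fixed thresholds, and the choice to count the seed among the reached nodes are all assumptions made for illustration.

```python
def linear_threshold_spread(in_edges, seeds, thresholds, max_days=1000):
    """Simulate the Linear Threshold Model in daily rounds.

    in_edges:   dict node -> {in-neighbour: influence weight}
    seeds:      nodes that are active on day 0
    thresholds: dict node -> activation threshold in [0, 1]
    Returns (active_nodes, days), where days counts the rounds in which
    at least one new node became active.
    """
    active = set(seeds)
    days = 0
    while days < max_days:
        newly_active = set()
        for node, weights in in_edges.items():
            if node in active:
                continue
            # Total influence received from neighbours that are already active.
            pressure = sum(w for src, w in weights.items() if src in active)
            if pressure >= thresholds[node]:
                newly_active.add(node)
        if not newly_active:
            break  # the spread has reached its limit
        active |= newly_active
        days += 1
    return active, days


def sp_score(in_edges, seed, thresholds):
    """Proportion of nodes reached divided by the number of days required."""
    active, days = linear_threshold_spread(in_edges, [seed], thresholds)
    reached = len(active) / len(in_edges)  # assumption: the seed is counted
    return reached / days if days > 0 else 0.0


# Hypothetical four-user graph; weights and thresholds are made up.
toy_in_edges = {
    "a": {},
    "b": {"a": 0.6},
    "c": {"a": 0.3, "b": 0.4},
    "d": {"c": 0.2},
}
toy_thresholds = {"a": 1.0, "b": 0.5, "c": 0.6, "d": 0.5}

score_a = sp_score(toy_in_edges, "a", toy_thresholds)
score_d = sp_score(toy_in_edges, "d", toy_thresholds)
print(score_a, score_d)
```

In this toy graph, user "a" activates "b" on the first day and "c" on the second, while "d" is never activated, so three of four nodes are reached in two days. A user whose posts reach nobody gets a score of zero under this sketch.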
We also implement the algorithms to calculate the SNOL and SP scores on our dataset. With the SNOL approach, the opinion leaders, known as influencers, are detected as the k dominant clusters out of N clusters obtained by running the K-means algorithm on the features. Then, an SVM model is fitted to tune the adjustable parameters. Since the SNOL score is calculated for each topic, in this experiment it is averaged over all current topics of the dataset to obtain the final SNOL score. With the SP approach, the algorithm is applied to the knowledge graph structure induced from our data, and the SP score is calculated with the Linear Threshold Model. Those scores and the scores of the proposed method are compared by their cosine similarity with the baseline engagement score.
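The comparison by cosine similarity can be sketched as follows. The per-user score vectors are hypothetical values for illustration only, not values from our dataset, and the function name is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


# Hypothetical scores for the same five users, in the same order.
baseline_engagement = [120.0, 40.0, 15.0, 300.0, 5.0]
method_scores = [0.8, 0.3, 0.1, 0.9, 0.05]

similarity = cosine_similarity(method_scores, baseline_engagement)
print(round(similarity, 3))
```

Each method yields one such similarity value against the baseline engagement score, which makes methods with very different score scales comparable.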
The dataset was collected from Facebook from 06-08-2018 to 06-09-2019. In total, 18,949 users were crawled, and we removed 15,074 users for whom no posts could be collected during the collection period. There are 9,225 remaining users with 312,130 posts and 112,180,524 interactions. Fig 4 compares the similarity scores of the proposed method (amplification factors combined with the content creation score, called AFG + CC) with those of the SNOL and SP approaches.
This figure shows that the results of the proposed method differ from those of the other methods, because the AFG + CC approach focuses on detecting micro-influencers for the brand, whereas the other approaches tend to identify celebrities. However, the total engagement score of the proposed method is better than that of the others when selecting a small group of users (k < 50), and better than that of the SP approach when the group of users is expanded. Those results are shown in Fig 5.
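One way to read the total engagement of a group of k users is as the summed engagement of the k users ranked highest by a method. The sketch below follows that reading, which is an assumption; the function name, the user identifiers, and all numbers are hypothetical.

```python
def total_engagement_at_k(scores, engagement, k):
    """Sum the engagement of the k users ranked highest by `scores`."""
    top_users = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(engagement[user] for user in top_users)


# Hypothetical engagement counts and two hypothetical rankings.
engagement = {"u1": 100, "u2": 300, "u3": 50}
scores_method_a = {"u1": 0.9, "u2": 0.5, "u3": 0.1}
scores_method_b = {"u1": 0.2, "u2": 0.1, "u3": 0.8}

total_a = total_engagement_at_k(scores_method_a, engagement, k=2)
total_b = total_engagement_at_k(scores_method_b, engagement, k=2)
print(total_a, total_b)
```

Repeating this for increasing k gives one curve per method, which is the kind of comparison a figure over group sizes would display.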

Application in a practical marketing campaign
This section presents the results of using the detected influencers in a practical marketing campaign. For business confidentiality, our customer's brand is called brand X, and the time window δ is six (06) days. The influencer marketing campaign was run in February 2020 and only considers Vietnamese users on Facebook. The campaign was separated into two phases:
• Phase 1: Feb. 12-18, 2020. The customer used 31 micro-influencers for brand X; the customer selected those influencers themselves.
Determining influencers by the AFG + CC approach. The influencers for product X in Phase 2 are determined by Algorithm 2.
Stage 1: Using the information of X, a sub-graph representing a group of brand-lovers of X is obtained, as shown in Fig 6.
Stage 2: Within this group, the emerging influencers for brand X in the time window (δ = 6 days) are determined using the proposed measures. Based on the opinions of experts and managers in online marketing, the values of the parameters in the formulas were chosen as follows:
• The values of (α₁, α₂, α₃), (β₁, β₂, β₃), and (γ₁, γ₂, γ₃) are as listed in Table 2.
• The threshold in (29) is set to 0.7, which means a user is a potential influential user if he/she is more influential than 70% of the members in the group G_X.
The list of potential influencers is shown in Table 2. Ten users can become influencers for product X in our customer's influencer marketing campaign.
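The 0.7 threshold can be read as a percentile rule over the group G_X. The following sketch illustrates that rule only; it is not Algorithm 2, and the function name, the use of the whole group size as the denominator, and the influence values are assumptions.

```python
def potential_influencers(influence, threshold=0.7):
    """Return the users who are more influential than at least
    `threshold` of the members of the group.

    influence: dict user -> influence score within the group
    """
    group_size = len(influence)
    selected = []
    for user, score in influence.items():
        # Count the members whose influence is strictly lower.
        weaker = sum(1 for other in influence.values() if other < score)
        if weaker / group_size >= threshold:
            selected.append(user)
    return selected


# Hypothetical group of ten brand-lovers with made-up influence scores.
group = {f"u{i}": float(i) for i in range(1, 11)}
print(potential_influencers(group))
```

With ten members and a threshold of 0.7, only users who outrank at least seven others are kept, so the rule adapts to the size and score distribution of each brand's group.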
Experimental results. In the following results, we compare the information propagation and the interactions with posts related to brand X in the two phases. The effectiveness of a marketing campaign is evaluated by the number of clicks per interaction, the conversion rate of clicks to orders, and the revenue. Table 3 and Fig 7 compare the number of interactions on brand X with those of its competitors in February 2020. The results show that brand X received more interactions than the others. Table 4 and Fig 8 show the number of interactions related to brand X in each phase of this influencer marketing campaign.

PLOS ONE
In the influencer marketing campaign for product X, Table 5 shows the number of interactions on the posts and the numbers of clicks and orders in the two phases. Fig 9 compares those values between the two phases of this campaign.
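The ratios used to evaluate a phase follow directly from such counts. The sketch below uses made-up counts, not the values of Table 5, and the function and key names are assumptions.

```python
def campaign_metrics(interactions, clicks, orders):
    """Clicks per interaction and conversion rate of clicks to orders."""
    return {
        "click_per_interaction": clicks / interactions if interactions else 0.0,
        "conversion_rate": orders / clicks if clicks else 0.0,
    }


# Hypothetical counts for two phases (illustration only).
phase_1 = campaign_metrics(interactions=1000, clicks=100, orders=5)
phase_2 = campaign_metrics(interactions=800, clicks=120, orders=12)
print(phase_1)
print(phase_2)
```

In this made-up example the second phase has fewer interactions but higher ratios, which shows why the raw interaction count alone does not determine the effectiveness of a campaign.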
Although phase 1 has more interactions than phase 2, both the clicks per interaction and the conversion rate of phase 2 are better than those of phase 1. Hence, the revenue of phase 2 is higher than that of phase 1, and in practice phase 2 brings more benefit to our customer. Fig 10 shows that phase 2 accounts for a larger share of the conversion rate and orders than phase 1. Besides, the average sales per influencer in phase 2 are higher than in phase 1 (Fig 10B). Thus, phase 2 is more effective than phase 1. Table 6 analyzes, by sentiment, only the comments made on the posts of the customer's influencers in this campaign. It shows that the rate of positive comments in phase 2 is higher than in phase 1, while the rate of negative comments is similar. Because the influencers in phase 2 favor brand X, the interest of their engaged audiences also leans toward brand X; thus, the rates of not-concerned comments and neutral comments in phase 2 are lower than in phase 1. The post contents in phase 2 are better and more attractive and receive more positive feedback from audiences. These results show that the proposed method helps identify potential influencers for a given brand and improves the conversion rate and the revenue of the influencer marketing campaign. After running the experimental campaign in the real world, our customer also gave good feedback on our method.
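The sentiment rates discussed above are the share of each kind of comment among all comments of a phase. A minimal sketch, assuming each comment has already been labeled by a sentiment analyzer, is given below; the labels are made up and the function name is an assumption.

```python
from collections import Counter


def sentiment_rates(labels):
    """Rate of each sentiment label among all comments of one phase."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels for ten comments (illustration only).
comment_labels = (
    ["positive"] * 5 + ["negative"] * 1 + ["neutral"] * 2 + ["not-concerned"] * 2
)
rates = sentiment_rates(comment_labels)
print(rates)
```

Computing the rates separately for each phase gives the per-phase columns that can then be compared.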

Discussions
The proposed method effectively finds the influencers of a product or brand on the Vietnamese social network. It combines measures of information propagation and content creation to determine emerging influencers. The approach is built on the passion point measure and on sentiment analysis. Moreover, the method can be applied to many products or brands that can be reached on an online social network. To apply it in another field, we only need to build the corpus of that field for crawling data. The collected data describe the community of users and their activities on the social network. The proposed method has been used to build a system for managing influencer marketing campaigns on the social network [73].
Our method works well on Facebook; however, some information propagation factors have to be adapted when moving from one social network platform to another. For example, on Twitter, the reaction point of a post, react_point(p), needs to be redefined. Sentiment analysis relies on the corpus of a language. Hence, when applying the proposed method in another language, the corpus for that language needs to be constructed.

Table 6 notes: 1 The rate between each kind of sentiment and total comments in phase 1. 2 The rate between each kind of sentiment and total comments in phase 2. https://doi.org/10.1371/journal.pone.0274596.t006

Conclusion and future work
In this paper, based on the SNet model, the amplification factors of a user are determined. They are computed with a method for estimating the user's information propagation, which improves on [54]. This method uses the social pulse of a post in the time window δ. In addition, a method for estimating the user's content creation score on social networks is proposed. This score is determined by combining the passion point with an analysis of the attractiveness of the post's content. The passion point is evaluated from the sentiment score of the posts and the user's activity. The post's content is analyzed using sentiment lexicons. The content creation score measures how well a post attracts interactions from audiences. Using these measures of content creation and information propagation, a method for detecting potential influencers has been proposed. This method can detect influencers who have an impact on other users on social networks with respect to a brand or a product. Moreover, the detected influencers can also create engaging posts for their audience. They are emerging influencers who can run the influencer marketing campaign for that brand or product. In the experiment, the proposed method, called AFG + CC, is compared with two other approaches, SNOL and SP. The results show that the proposed method detects micro-influencers for the brand, whereas the other approaches tend to identify celebrities. However, the total engagement score of the proposed method is better than that of the others when selecting a small group of users (k < 50), and better than that of the SP approach when the group of users is expanded. Moreover, the AFG + CC method was applied to run a real-world influencer marketing campaign. This experiment shows that the influencers detected by our method are more effective than the others: they improve the conversion rate and the revenue of the influencer marketing campaign.
In the future, our method will be tested on other social network platforms, such as Twitter [74] and Zalo [75]. Moreover, the measure of content creation will be improved to become a general method for evaluating a post's content. The improved method can be applied to increase the effectiveness of a content marketing campaign. Although the SNet model can represent the structure of social networks, further techniques will be studied to process other kinds of collected data, such as images and clips. Those improvements can be applied to other social media platforms, such as Instagram [76] and TikTok [77].
Recognizing consumer behaviors is vital for approaching target customers. In further research, a method for determining changes in behaviors will be studied. This method can be combined with the content creation and information propagation measures to determine influence diffusion on the social network [78] and to establish an effective online marketing strategy for a specific commercial brand [20,79].